ABSTRACT :

The reliability of supplying data stored in a plurality of different memories to 
different users is enhanced by (a) dividing each of the memories into primary and 
secondary sections, (b) partitioning the data into successive blocks and (c) 
storing the blocks of data in sequence in respective ones of the primary sections.
The blocks of data that have been stored in the primary section of one of the
memories are then stored in sequence in respective ones of the secondary sections
of the other ones of the memories.
TITLE: Method for employing doubly striped mirroring of data and reassigning data
streams scheduled to be supplied by failed disk to respective ones of remaining 
disks 



Drawing Description Text (5) : 

FIG. 3 illustrates a schedule that a storage node of FIG. 1 may create for the 
unloading of data blocks stored in an associated disk; 

Detailed Description Text (3) : 

With that in mind, in an illustrative embodiment of the invention, multiplexer 75, 
FIG. 1, which may be, for example, an Asynchronous Transfer Mode switch, receives
data in the form of packets from storage nodes 50-1 through 50-N, respectively. 
Multiplexer 75 then routes the received packets over respective virtual circuit 
connections via communications paths 75-1 through 75-N to their intended 
destinations, in which a packet may carry a segment of a video program and in which 
the content of the video program may be doubly striped and mirrored in accord with 
the invention across storage disks 40-11 through 40-j4, as will be explained below 
in detail. Because of such striping, a storage node 50i may supply a packet 
carrying a respective portion of a segment of a particular video program to 
multiplexer 75 via a virtual channel assigned to the segments that are stored in 
storage disks 40k associated with storage node 50i and that are to be delivered to 
the same destination (subscriber), where N, j, i and k>1. Thus, segments of the
content of a video program striped across the storage disks may be supplied to 
multiplexer 75 via respective ones of a number of different virtual data channels 
assigned for that purpose. Once such contents have been so delivered, then the 
assigned virtual data channels may be reassigned for some other but similar purpose 
(or left idle).

Detailed Description Text (5) : 

Processor 25, more particularly, first determines if it has the resources available 
to service the request. That is, if a disk is able to support n data streams and 
the content of a program is striped over N disks, then nN streams (users) can be 
supported from the array of N disks, where n and N>1. Thus, server 100 may service 
the user's request if the current number of data streams that are being supplied by 
the array of disks 40-11 through 40-j4 to multiplexer 75 is less than nN. Assuming
that is the case, then processor 25 communicates with multiplexer 75 to obtain 
channel assignments that the storage nodes may use to sequentially transmit their 
respective video segments that form the requested video to multiplexer 75. Included
in such communication is a request to establish a virtual connection between each 
of the assigned channels and a channel of one of the communications paths 76-1 
through 76-N that will be used to route the program to the user. In addition, 
processor 25 establishes a schedule that the storage nodes 50i are to follow in the 
delivery of the segments to the user via multiplexer 75. Processor 25 then supplies 
the assigned channels and schedule to each of the storage nodes 50i via Local Area 
Network (LAN) 30, as discussed below. The schedule includes the identity of the 
storage node, e.g., 50-1, and associated disk, e.g., disk 40-11, storing the 
initial (first) segment of the requested program. The schedule also includes the 
time of day that the first segment of the program is to be unloaded from disk and 
supplied to multiplexer 75. Processor 25 then forms a message containing, inter 
alia, the (a) schedule established for running the program, (b) identity of the
program, (c) identity of the storage node containing the first segment of the 
program, (d) channel that has been assigned to the node whose address is contained 
in the header of the message, and (e) time of day that the first segment is to be 
transmitted to the user. Processor 25 then sends the message to the storage node 
identified in the message via LAN 30. Processor 25 then updates the message so that 
it is applicable to a next one of the storage nodes and sends the message to that 
node via LAN 30. In an illustrative embodiment of the invention, processor 25 sends 
the message first to the storage node 50i containing the first segment of the 
requested program. Processor 25 sequentially sends the message to the remaining 
storage nodes based on the order of their respective addresses and updates the 
message following each such transmission. 
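The resource check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it reads the admission threshold as n times N (n streams per disk, N disks), and the function name `can_admit` and its parameter names are hypothetical.

```python
def can_admit(active_streams: int, streams_per_disk: int, num_disks: int) -> bool:
    """Return True if one more stream fits within the n * N capacity of the array.

    streams_per_disk is n and num_disks is N in the description; both names
    are assumptions made for this sketch.
    """
    if streams_per_disk < 1 or num_disks < 1:
        raise ValueError("n and N must both be at least 1")
    return active_streams < streams_per_disk * num_disks


if __name__ == "__main__":
    # With n = 2 streams per disk and N = 4 disks, capacity is 8 streams.
    print(can_admit(7, 2, 4))
    print(can_admit(8, 2, 4))
```

Only when this check succeeds would the processor go on to request channel assignments and build the delivery schedule.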

Detailed Description Text (6) : 

Since the storage nodes 50i are similar to one another, a discussion of one such 
node equally pertains to the other storage nodes. It is thus seen that storage node
50-1, FIG. 1, includes microprocessor 52-3 for communicating with host 25, in which 
the communications are typically directed to setting up a schedule for the delivery 
of respective video segments stored in buffer 52-2 to multiplexer 75 via the 
assigned data channel and for controlling the storing and unloading of the segments 
from buffer 52-2. Buffer 52-2 represents a dual buffer arrangement in which 
microprocessor 52-3 unloads segments of video from respective ones of the disks
40-11 through 40-14 and stores the unloaded segments in a first one of the dual
buffers 52-2 during a first cycle. During that same cycle, e.g., a time period of 
one second, microprocessor 52-3 in turn unloads portions of respective segments of 
respective videos stored in the second one of the dual buffers 52-2 during a 
previous cycle. Adapter 52-1 reads a packet from the buffer and transmits the 
packet to multiplexer 75 via communication path 51-1 (which may be, e.g., optical 
fiber) and the channel assigned for that particular purpose. OC3 adapter 52-1 
implements the well-known OC3 protocol for interfacing a data terminal, e.g., 
storage node 50-1, with an optical fiber communications path, e.g., path 51-1. SCSI 
adapter 52-4, on the other hand, implements a Small Computer System Interface 
between microprocessor 52-3 and its associated disks 40-11 through 40-14 via bus 
45-1. 

Detailed Description Text (9) : 

For the sake of simplicity and clarity, only disks 65-1 through 65-4 of a server 
60, FIG. 2, are shown. (The other elements of the server 60, e.g., storage node, 
host processor, etc., are not shown.) Assume that the host processor has divided a 
video program into a plurality of sequential data blocks (segments) D0 through Di
for storage in the disks, in accord with the invention. To do so, the host
processor stores the data blocks D0 through Di in sequential order (i.e.,
round-robin order) in the primary sections P of disks 65-1 through 65-4, respectively, as
shown in FIG. 2. That is, the host processor stripes the data blocks across the 
primary sections of disks 65-1 through 65-4. The host processor then makes a backup 
copy of the contents of the primary sections of the disks 65i, for example, 65-1. 
It does this, in accord with an aspect of the invention, by striping the data 
blocks forming such contents across the secondary sections of the other disks, 
i.e., disks 65-2 through 65-4, in round-robin order. For example, it is seen from 
FIG. 2 that the first three data blocks D0, D4 and D8 stored in disk 65-1 are also
stored in the secondary (backup) sections S of disks 65-2 through 65-4, 
respectively. This is also the case for data blocks D12, D16 and D20 as well as the 
remaining contents of the primary section of disk 65-1. Similarly, the contents of 
the primary section of disk 65-2 are striped across disks 65-3, 65-4 and 65-1 in 
that order. For example, it is also seen from FIG. 2 that data blocks D1, D5 and D9
of the primary section of disk 65-2 are striped across disks 65-3, 65-4 and 65-1, 
respectively, and so on. Such backup striping is shown for the contents of the 
primary sections of disks 65-3 and 65-4. 
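The placement rule described above can be sketched in a few lines. The sketch assumes zero-based disk indices (index 0 standing for disk 65-1) and assumes the same rotation applies to every disk, as the examples for disks 65-1 and 65-2 suggest. The function names are hypothetical.

```python
def primary_disk(block: int, num_disks: int) -> int:
    """Disk whose primary section holds the block (round-robin striping)."""
    return block % num_disks


def secondary_disk(block: int, num_disks: int) -> int:
    """Disk whose secondary section holds the backup copy of the block.

    The k-th block of a disk's primary section is copied to the k-th disk,
    in round-robin order, among the other num_disks - 1 disks.
    """
    if num_disks < 2:
        raise ValueError("mirroring needs at least two disks")
    home = block % num_disks
    k = block // num_disks  # position of the block within its home disk
    return (home + 1 + k % (num_disks - 1)) % num_disks


def layout(num_blocks: int, num_disks: int):
    """Return (primary, secondary) lists of block numbers per disk."""
    primary = [[] for _ in range(num_disks)]
    secondary = [[] for _ in range(num_disks)]
    for block in range(num_blocks):
        primary[primary_disk(block, num_disks)].append(block)
        secondary[secondary_disk(block, num_disks)].append(block)
    return primary, secondary


if __name__ == "__main__":
    prim, sec = layout(12, 4)
    for disk in range(4):
        print(f"disk {disk}: P={prim[disk]} S={sec[disk]}")
```

Under these assumptions, blocks D0, D4 and D8 of the first disk land in the secondary sections of the second, third and fourth disks, and D1, D5 and D9 of the second disk land on the third, fourth and first, matching the examples in the text. No block is ever backed up on its own primary disk.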

Detailed Description Text (11) : 
Assume at this point that the host processor has received a request for the video
program from a user and the host has communicated that request to each of the
storage nodes 60-1 through 60-4 with the indication that the start of the program,
block D0, is to be delivered at time t0. Further assume that a cycle is one second
and a block of data is three (3) megabits. Upon receipt of the request, node 60-1
generates a schedule for the unloading of the blocks of the program that are stored
in associated disk 65-1 and stores the schedule in internal memory (not shown). The
schedule so generated starts with the unloading of block D0 at time t0 minus 1
second for delivery to the associated server 300 multiplexer at time t0. The next
entry, D4, is scheduled to be unloaded at time t0 plus 3 seconds for delivery at t0
plus 4 seconds. The next entry, D8, is scheduled to be unloaded at time t0 plus 7
seconds for delivery at t0 plus 8 seconds, and so on. An illustrative example of
such a schedule is shown in FIG. 3.

Detailed Description Text (12) : 

Storage node 60-2 generates a similar schedule with respect to data blocks D1, D5,
D9, etc. That is, a storage node is programmed so that it determines its order in
delivering the sequence of data blocks forming the requested program to the
associated multiplexer as a function of the identity of the node 60i having the
first block of such data. Accordingly, the schedule that node 60-2 generates will
indicate that data blocks D1, D5, D9, etc., are to be respectively delivered during
cycles t0+1, t0+5, t0+9, and so on. Similarly, storage nodes 60-3 and 60-4 generate
their own delivery schedules with respect to the data blocks of the requested 
program that are stored in their associated disks. 
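The per-node schedules described above follow a simple pattern, sketched here. The sketch assumes one block is delivered per cycle, that each block is unloaded one cycle before its delivery, and that nodes are indexed from zero. The function name `node_schedule` and its parameters are hypothetical.

```python
def node_schedule(node_index: int, num_nodes: int, num_blocks: int,
                  first_node: int = 0, t0: float = 0.0, cycle: float = 1.0):
    """Return (block, unload_time, delivery_time) entries for one storage node.

    first_node is the index of the node holding block D0; a node derives its
    own position in the delivery order from that identity.
    """
    entries = []
    # First block number held by this node when D0 sits on first_node.
    start = (node_index - first_node) % num_nodes
    for block in range(start, num_blocks, num_nodes):
        delivery = t0 + block * cycle
        entries.append((block, delivery - cycle, delivery))
    return entries


if __name__ == "__main__":
    for node in range(4):
        print(node, node_schedule(node, 4, 12))
```

With four nodes and a one-second cycle, the first node unloads D0, D4 and D8 at t0-1, t0+3 and t0+7, and the second node delivers D1, D5 and D9 at t0+1, t0+5 and t0+9, as in the text.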

Detailed Description Text (16) : 

If the program exits block 405 via the "no" path, then it increments (block 407)
the number of data streams in batch i and sets a variable m to the address of node
j. The program then assigns the user's request to batch i for transmission via bus
m and the appropriate storage node 50m. The program (block 409) then sets i=(i+1)
mod N and sets m=(m+1) mod N. The program (block 410) then determines if m=j and
exits if that is the case. Otherwise, the program returns to block 408.
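One possible reading of this loop is sketched below. It rests on several assumptions that the excerpt does not confirm: that the assignment step is block 408, that block 409 advances the batch index i as well as m, and that the stream count is incremented only once before the loop. The name `assign_request` and the list `batch_counts` are hypothetical.

```python
def assign_request(batch_counts: list, i: int, j: int, num_nodes: int):
    """Walk all nodes once, starting at node j, pairing each node with a batch.

    Returns a list of (node, batch) pairs in the order they are assigned.
    """
    batch_counts[i] += 1              # block 407: one more stream in batch i
    m = j                             # block 407: start at node j
    assignments = []
    while True:
        assignments.append((m, i))    # assumed block 408: node m serves batch i
        i = (i + 1) % num_nodes       # block 409: next batch
        m = (m + 1) % num_nodes       # block 409: next node
        if m == j:                    # block 410: back at the starting node
            break
    return assignments


if __name__ == "__main__":
    counts = [0, 0, 0, 0]
    print(assign_request(counts, i=1, j=2, num_nodes=4))
    print(counts)
```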

ART-UNIT: 232

PRIMARY-EXAMINER: Bowler; Alyssa H. 
ASSISTANT-EXAMINER: Davis, Jr.; Walter D. 
ATTY-AGENT-FIRM: Perman & Green 

ABSTRACT : 

A multiprocessor system includes a plurality of substantially identical nodes 
interconnected through a switching network, each node including a disk drive, 
NVRAM, and a processor. The system stores data in either a RAID or mirrored fashion 
across a plurality of disk drives in different nodes. When data is stored in a RAID 
arrangement, an NVRAM in a parity node is provided with an entry including the new 
data, a copy of old data from the node to which the new data is to be written, a 
copy of the old parity, and a synchronization state indicator. The parity node 
determines new parity and transmits the new data to the data node for storage. Upon
receiving an acknowledgement, the parity node resets the synchronization indicator.
When power-up occurs, after a power failure, the parity node scans its NVRAM for
any entry and upon finding one with a non-reset state indicator, transmits the new
data to a destination data node for entry thereby synchronizing the contents of
data and parity nodes. In a mirrored system, NVRAM in only one node has a data
identifier entered into its NVRAM so that, upon a power failure and subsequent
power-up, that entry enables the system to know which disk drives are in a
non-synchronized state, and to cause actions that result in re-synchronization.

13 Claims, 6 Drawing figures 

Brief Summary Text (9) : 

The problem becomes more complex in architectures wherein multiple disks are 
written asynchronously, typically by separate controllers which can reside on 
separate processing nodes that are connected by a communication network. Where such
disks are used for transaction processing systems, the prior art has made provision
for using high level software transaction logs to enable resynchronization of the
various disks, following a system failure. 

Brief Summary Text (21): 

A multiprocessor system includes a plurality of substantially identical nodes 
interconnected through a switching network, each node including a disk drive, 
NVRAM, and a processor. The system stores data in either a parity protected RAID or 
mirrored fashion across a plurality of disk drives in different nodes. When data is 
stored in a RAID arrangement, an NVRAM in a parity node is provided with an entry 
including the new data, a copy of the new parity, and a synchronization state 
indicator. The parity node determines new parity and transmits the new data to the 
data node for storage. Upon receiving an acknowledgement, the parity node resets
the synchronization indicator. When power-up occurs, after a power failure, the 
parity node scans its NVRAM for any entry and upon finding one with a non-reset 
state indicator, transmits the new data to the destination data node for entry. In 
a mirrored system, NVRAM in only one node has a data identifier entered into its 
NVRAM so that, upon a power failure and subsequent power-up, that entry enables the 
system to know which disk drives are in a non-synchronized state. 
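The RAID write path summarized above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the patented implementation. The class and method names are hypothetical, a single parity block stands in for one stripe, and new parity is formed as the exclusive-or of old data, old parity and new data (the combination recited in the claims).

```python
from dataclasses import dataclass, field


def xor_bytes(*blocks: bytes) -> bytes:
    """Byte-wise exclusive-or of equally sized blocks."""
    if not blocks or any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("need one or more blocks of equal length")
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for idx, byte in enumerate(block):
            out[idx] ^= byte
    return bytes(out)


@dataclass
class NvramEntry:
    new_data: bytes
    new_parity: bytes
    pending: bool = True  # True models the "non-reset" synchronization state


@dataclass
class DataNode:
    segments: dict = field(default_factory=dict)

    def store(self, segment_id: int, data: bytes) -> bool:
        self.segments[segment_id] = data
        return True  # stands in for the acknowledgement


@dataclass
class ParityNode:
    parity: bytes  # parity of one stripe (a simplification for this sketch)
    nvram: dict = field(default_factory=dict)

    def log_update(self, segment_id: int, old_data: bytes, new_data: bytes) -> None:
        """Record new data and new parity in NVRAM before anything is sent."""
        new_parity = xor_bytes(old_data, self.parity, new_data)
        self.nvram[segment_id] = NvramEntry(new_data, new_parity)
        self.parity = new_parity

    def transmit(self, segment_id: int, data_node: DataNode) -> None:
        """Send the logged data; reset the indicator once acknowledged."""
        entry = self.nvram[segment_id]
        if data_node.store(segment_id, entry.new_data):
            entry.pending = False

    def power_up(self, data_node: DataNode) -> None:
        """After a power failure, resend every entry still marked pending."""
        for segment_id, entry in self.nvram.items():
            if entry.pending:
                self.transmit(segment_id, data_node)
```

If power fails between `log_update` and `transmit`, the entry survives with its indicator still set, so `power_up` resends the data and the data and parity nodes end up consistent.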

Drawing Description Text (3) : 

FIG. 2 is a flow diagram indicating the procedures followed by the system of FIG. 
1, in the case of mirrored-redundant data distribution. 

Drawing Description Text (4): 
FIG. 3 is a flow diagram of the procedure followed by the system of FIG. 1
subsequent to a power-up in a mirrored data redundancy arrangement.

Detailed Description Text (2) : 

Referring to FIG. 1, a multiprocessor system 10 comprises a plurality of nodes 12, 
each of which is substantially identical, all such nodes interconnected via a 
switch network 14. Each node 12 includes a disk drive 16, a processor 18, RAM 20 
and an NVRAM 22. Processor 18, in the known manner, controls the operation of disk
drive 16, RAM 20, and NVRAM 22. The operation of system 10 is controlled by one or
more nodal processors 18. The processor(s) may be located at a central controlling
node (e.g., node 24) or may be distributed throughout the nodal structure. Each
node 12 must be accessible to a controlling node by means of switching network 14. 
Thus, any controlling node attempting to read or write a disk block must be in 
direct contact with all nodes in a parity group storing the block. In the 
alternative, the controlling node that attempts to read or write a disk block must 
be in contact with one of the disk nodes in the parity group, and the nodes in the 
parity group must be fully interconnected. 

CLAIMS: 

1. A multiprocessor system including a plurality of substantially identical nodes 
interconnected through a switch network, each node comprising disk drive means, 
nonvolatile random access memory (NVRAM) and a processor, said multiprocessor 
system storing RAID-structured data across disk drive means in a plurality of 
different nodes, said system performing a method comprising the steps of: 

a. listing at least an identifier of a data segment to be updated by received 
update data in an NVRAM in a first node in response to a command to write said 
update data to said data segment; 

b. sending said update data from said first node to a second node containing a copy 
of said data segment; 

c. removing said listing of said identifier in said NVRAM in said first node only 
when said update data is written to disk drive means in said first node and after 
receiving a signal that said second node has recorded said update data; 

d. causing each node, in the event of a power-up, to scan its NVRAM to find any 
listed identifiers of data segments contained therein; and 

e. for any data segment denoted by a listed identifier in said NVRAM in said first 
node, causing a corresponding data segment in said second node to be in synchronism 
with said data segment denoted by said listed identifier in NVRAM in said first
node.

7. A multiprocessor system including a plurality of substantially identical nodes 
interconnected through a switch network, each node comprising disk drive means, a 
nonvolatile random access memory (NVRAM) and a processor, said multiprocessor 
system storing RAID-structured data across disk drive means in a plurality of 
different nodes, said system performing a method comprising the steps of: 

responding to a command to write new data to replace old data in a data segment 
stored in a first node, by storing in NVRAM in a different node which stores parity 
data corresponding to old data stored in said first node, an entry comprising said 
new data, a state indication, and a copy of new parity as calculated based upon an 
exclusive-or combination of old data from said first node, old parity from said 
different node and said new data; 

transmitting said new data to said first node for storage therein, and upon 
receiving a signal acknowledging successful storage, causing said different node to
reset said state indication; and

causing said different node, in the event of a power-up to scan its NVRAM for a 
said entry, and upon finding a said entry with a non-reset state indication, 
transmitting said new data to said first node. 

12. A multiprocessor system including a plurality of substantially identical nodes 
interconnected through a switch network, each node comprising disk drive means, 
nonvolatile random access memory (NVRAM) and a processor, said multiprocessor
system storing RAID-structured data across a disk drive means in a plurality of
different nodes, said system comprising:

means for listing at least an identifier of a data segment to be updated by update 
data in an NVRAM in a first node in response to a command to write said update data 
to said data segment in said first node; 

means for sending said update data from said first node to a second node containing 
a copy of said data segment; 

means for removing said listing of said data segment in said NVRAM in said first 
node only when said update data is written to disk drive means in said first node 
and after receiving a signal that said second node has recorded said update data; 

means for causing each node, in the event of a power-up, to scan its NVRAM to find 
any listed identifiers of data segments contained therein; and 

means, responsive to finding a data segment identifier listed in said NVRAM in said 
first node, for causing a corresponding data segment in said second node to be in 
synchronism with said data segment listed in said NVRAM in said first node. 

13. A multiprocessor system including a plurality of substantially identical nodes 
interconnected through a switch network, each node comprising disk drive means, a 
nonvolatile random access memory (NVRAM) and a processor, said multiprocessor 
system storing RAID-structured data across disk drive means in a plurality of 
different nodes, said system comprising: 

means for responding to a command to write new data to replace old data in a data 
segment stored in a first node, by storing in NVRAM in a parity node which stores 
parity data corresponding to data stored in said first node, an entry comprising 
said new data, a state indication, and a copy of new parity as calculated based 
upon an exclusive-or combination of old data from said first node, old parity from 
said parity node and said new data; 

means for transmitting said new data to said first node for storage therein, and 
upon receiving a signal acknowledging successful storage, causing said parity node 
to reset said state indication; and 

means for causing said parity node, in the event of a power-up to scan its NVRAM 
for a said entry, and upon finding a said entry with a non-reset state indication, 
to transmit said new data to said first node. 
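The identifier-listing steps recited in claims 1 and 12 can be modeled as a small intent log. The sketch below makes several assumptions: the names `MirrorNode`, `write`, `record` and `power_up` are hypothetical, the `crash_before_mirror` flag exists only to simulate a power failure, and recovery copies the first node's current segment to the second node, which is one way (not necessarily the claimed way) to bring the two into synchronism.

```python
class MirrorNode:
    """In-memory stand-in for a node with a disk and an NVRAM identifier list."""

    def __init__(self) -> None:
        self.disk = {}      # segment_id -> data
        self.nvram = set()  # identifiers of segments with an update in flight

    def record(self, segment_id: int, data: bytes) -> bool:
        """Store a copy sent by the other node and acknowledge it."""
        self.disk[segment_id] = data
        return True

    def write(self, segment_id: int, data: bytes, mirror: "MirrorNode",
              crash_before_mirror: bool = False) -> None:
        self.nvram.add(segment_id)            # step a: list the identifier
        self.disk[segment_id] = data          # local write
        if crash_before_mirror:
            return                            # simulated power failure
        if mirror.record(segment_id, data):   # step b: send the update
            self.nvram.discard(segment_id)    # step c: remove the listing

    def power_up(self, mirror: "MirrorNode") -> None:
        """Steps d and e: scan NVRAM and resynchronize every listed segment."""
        for segment_id in sorted(self.nvram):
            if segment_id in self.disk:
                mirror.record(segment_id, self.disk[segment_id])
            self.nvram.discard(segment_id)


if __name__ == "__main__":
    first, second = MirrorNode(), MirrorNode()
    first.write(7, b"v1", second)
    first.write(7, b"v2", second, crash_before_mirror=True)
    first.power_up(second)
    print(second.disk[7])
```

Because only identifiers are listed, the NVRAM footprint stays small, and the scan at power-up touches only the segments that may be out of synchronism.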
ABSTRACT :



This invention makes it possible for a customer computer to connect to an online 
service provider computer by phone, Internet, or other method, pay a fee to said 
service provider, and obtain additional processing and storage resources for the 
customer's computer. The resources can take the form of virtual storage and 
processing capabilities. These capabilities give the customer computer what appears 
to be additional local processing power and/or additional local storage, this 
storage possibly including preloaded software and/or data. 

The additional resources made available to the customer computer can be used either 
to enhance the customer's local needs (such as access to virtual storage for
additional disk space, or access to a more powerful processor of similar type for
program execution), or these additional resources can be used by the customer
computer to support services on-line that otherwise would be unavailable,
impractical, or unaffordable. Examples of services include software and information
rental, sales, and release update services, anti-viral services, backup and 
recovery services, and diagnostic and repair services, to name a few. 